Towards Opinion Summarization from Online Forums

Authors

  • Ying Ding
  • Jing Jiang
Abstract

Summarizing opinions expressed in online forums can potentially benefit many people. However, special characteristics of this problem may require changes to standard text summarization techniques. In this work, we present our initial attempt at extractive summarization of opinionated online forum threads. Given the nature of user-generated content in online discussion forums, we hypothesize that besides relevance, text quality and subjectivity also play important roles in deciding which sentences are good summary sentences. We therefore construct an annotated corpus to facilitate our study of extractive summarization of online discussion forums. We define a set of features to capture relevance, text quality and subjectivity, and empirically test their usefulness in choosing summary sentences. Using an unpaired Student’s t-test, we find that sentence length and number of sentiment words have high correlations with good summary sentences. Finally, we propose some simple modifications to a standard Integer Linear Programming-based summarization framework to incorporate these features.
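The feature test mentioned in the abstract can be illustrated with a minimal sketch. It assumes the pooled-variance (equal-variance) form of the unpaired Student's t-test and compares one feature, sentence length, between summary and non-summary sentences. The function name `unpaired_t` and both lists of lengths are hypothetical and made up for illustration; they are not taken from the paper or its corpus.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Return (t statistic, degrees of freedom) for an unpaired
    Student's t-test with pooled variance (assumes equal variances)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance of the two groups
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


# Toy sentence lengths in tokens (made-up values, not data from the paper)
summary_lengths = [22, 25, 19, 30, 27]
other_lengths = [8, 12, 10, 15, 9]

t_stat, dof = unpaired_t(summary_lengths, other_lengths)
print(f"t = {t_stat:.3f}, df = {dof}")
```

A large absolute t value suggests that the feature's mean differs between the two groups. Turning it into a p-value would additionally require the t distribution with `dof` degrees of freedom, which this sketch leaves out.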

Similar resources

Thread Specific Features are Helpful for Identifying Subjectivity Orientation of Online Forum Threads

Subjectivity analysis has been actively used in various applications such as opinion mining of customer reviews in online review sites, question-answering in CQA sites, multi-document summarization, etc. However, there has been very little focus on subjectivity analysis in the domain of online forums. Online forums contain huge amounts of user-generated data in the form of discussions between f...

An Approach for Online Analysis using Expectation Maximization

Opinion-rich web resources such as discussion forums, review sites and blogs are bulky and available in digital form. From both the customer and the business perspective, scanning these reviews manually is burdensome. Hence, processing reviews automatically and summarizing them in a suitable form is more efficient. The distinguished problem of producing opinion su...

Gold Standard Online Debates Summaries and First Experiments Towards Automatic Summarization of Online Debate Data

Usage of online textual media is steadily increasing. Daily, more and more news stories, blog posts and scientific articles are added to the online volumes. These are all freely accessible and have been employed extensively in multiple research areas, e.g. automatic text summarization, information retrieval, information extraction, etc. Meanwhile, online debate forums have recently become popul...

Towards Argumentative Opinion Mining in Online Discussions

Online discussion forums (Figure 1) typically manifest into tree-like structures that are reminiscent of argument trees. Whilst these discussion forums contain a wealth of information related to people’s opinions, they also include implicit argumentation information. However, unlike argument trees, any relationship between posts in a discussion tree remains implicit. In recent years there has been...

Evaluative Pattern Extraction for Automated Text Generation

Getting travel tips from experienced bloggers and online forums has become one of the important supplements to the travel guidebook in the web society. In this paper we present a novel approach by identifying and extracting evaluative patterns, providing a different linguistically-motivated framework for automated evaluative text generation. We target at domain-specific observation in online ...

Publication date: 2015